Description

FIELD OF THE INVENTION 

This invention relates to a biologically pure DNA signal sequence which encodes an amino acid signal
peptide necessary for directing the secretion from certain defined hosts of proteins in bioactive form.

BACKGROUND OF THE INVENTION

In the biological production of commercially viable proteins by the fermentation of microorganisms, the
ability to produce the desired proteins by fermentation with secretion of the proteins by the microorganisms
into the broth is very significant. However, there are many commercially viable proteins encoded by
genetically engineered DNA constructs which are not secreted by the cells in which the DNA is expressed.
This often necessitates harvesting the cells, bursting the cell walls, recovering the desired proteins in pure
form and then chemically re-naturing the pure material to restore its bioactive function. This downstream
processing, as it is called, is illustrated in Figure 1.

Some cells and microorganisms carry out the biological equivalent of downstream processing by
secreting proteins in bioactive form. The mechanism which directs the secretion of some proteins through
the cell walls is not fully understood. For example, Streptomyces griseus, an organism used for the
commercial production of Pronase, secretes many extracellular proteins (Jurasek, L., P.
Johnson, R.W. Olafson, and L.B. Smillie (1971), An improved fractionation system for pronase on CM-
Sephadex, Can. J. Biochem., 49:1195-1201). Protease A and protease B, two of the serine proteases
secreted by S. griseus, have sequences which are 61% homologous on the basis of amino acid identity
(Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease
at 1.7 Å resolution; Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502; Jurasek,
L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and L.H. Ericsson (1974), Amino acid sequencing of
Streptomyces griseus protease B, a major component of pronase, Biochem. Biophys. Res. Comm.,
61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and M.O. Dayhoff (1978), Serine proteases, in
M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5, suppl. 3:73-93). These proteases also have
similar tertiary structure, as determined by X-ray crystallography (Delbaere, L.T.J., W.L.B. Hutcheon, M.N.G.
James, and W.E. Thiessen (1975), Tertiary structural differences between microbial serine proteases and
pancreatic serine enzymes, Nature 257:758-763; Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G.
James (1985), Refined structure of α-lytic protease at 1.7 Å resolution; Analysis of hydrogen bonding and
solvent structure, J. Mol. Biol., 183:479-502; James, M.N.G., A.R. Sielecki, G.D. Brayer, L.T.J. Delbaere, and
C.-A. Bauer (1980), Structures of product and inhibitor complexes of Streptomyces griseus protease A at
1.8 Å resolution, J. Mol. Biol., 144:43-88). Although the structures of proteases A and B have been
extensively studied, the genes encoding these proteases have not been characterized before. EP-A-0 222
279 discloses signal peptides derived from Streptomyces.

SUMMARY OF THE INVENTION

In accordance with this invention, the genes encoding protease A and protease B of S. griseus have
been isolated and investigated to reveal DNA sequences which each direct the secretion of an encoded
protein fused either directly or indirectly to a signal peptide encoded by the DNA.

According to an aspect of the invention, a recombinant DNA sequence comprises a signal sequence
and a gene sequence encoding a protein. The recombinant DNA sequence, when expressed in a living cell,
encodes an amino acid signal peptide with the protein. The signal peptide directs secretion of the protein
from a cell within which the DNA signal sequence is expressed.

According to another aspect of the invention, a biologically pure isolated DNA signal sequence encodes
a 38 amino acid signal peptide which directs secretion of a recombinant gene-sourced protein linked to
such 38 amino acid signal peptide, from a cell in which the DNA signal sequence is expressed. The DNA
signal sequence is isolated from Streptomyces griseus.

According to another aspect of the invention, the DNA signal sequence in conjunction with a gene
sequence encoding a protein is inserted into a vector, such as a plasmid or a phage.

According to another aspect of the invention, the DNA signal sequence is adapted for expression in a
living cell having enzymes catalyzing the formation of disulphide bonds.

According to another aspect of the invention, the biologically pure isolated DNA signal sequence is that of
Figure 4a.

According to another aspect of the invention, the biologically pure isolated DNA signal sequence is that of
Figure 5a.

According to another aspect of the invention, a fused protein is encoded by the recombinant DNA
sequence of Figure 4 or Figure 5.

According to another aspect of the invention, a transformed prokaryotic cell is provided which has
inserted therein a suitable vector including the recombinant DNA encoding the signal protein. The
transformed prokaryotic cell may be selected from the Streptomyces genera.

According to another aspect of the invention, a biologically pure culture has a transformed prokaryotic
cell with the recombinant DNA sequence in a suitable vector. The culture is capable of producing, as an
intermediate, the fused protein of the amino acid signal peptide and the protein. The protein itself is
produced in a recoverable quantity upon fermentation of the transformed cell in an aqueous nutrient
medium. The signal peptide directs secretion of the protein from the cell.

According to another aspect of the invention, a biologically pure culture, transformed with the functional
signal sequence as described above, is able to direct the secretion from the cell of proteins whose
bioactivity is dependent upon the formation of correctly positioned intramolecular disulphide bonds.

A biologically pure DNA sequence encoding a fused protein including protease A has the combined
DNA sequence of Figures 4a, 4b and 4c.

A biologically pure DNA sequence encoding a fused protein including protease B has the combined 
DNA sequence of Figures 5a, 5b and 5c. 

BRIEF DESCRIPTION OF THE DRAWINGS

With reference to the Figures, a variety of short forms have been used to identify restriction
endonucleases, amino acids, deoxyribonucleic acids and related information. Standard nomenclature has
been used in identifying all of these components, as is readily appreciated by those skilled in the art.
Preferred embodiments of the invention are described with respect to the drawings, wherein:

Figure 1 illustrates downstream processing;
Figure 2 shows restriction endonuclease maps of DNA fragments of sprA and sprB;
Figure 3 illustrates restriction endonuclease maps and sequencing strategies in sequencing DNA
fragments containing sprA and sprB;
Figure 4 is the DNA sequence of sprA;
Figure 4a is the DNA sequence encoding the sprA (protease A) signal peptide;
Figure 4b is the DNA sequence encoding the sprA (protease A) propeptide;
Figure 4c is the DNA sequence encoding mature protease A;
Figure 5 is the DNA sequence of sprB;
Figure 5a is the DNA sequence encoding the sprB (protease B) signal peptide;
Figure 5b is the DNA sequence encoding the sprB (protease B) propeptide;
Figure 5c is the DNA sequence encoding mature protease B;
Figure 6 is an alignment of the amino acid sequences deduced from sprA and sprB to develop homology
between the two sequences.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The organism Streptomyces griseus is a well recognized microorganism. It is commercially used for the
production of the enzyme Pronase. It is appreciated, however, that this organism also secretes two
enzymes, protease A and protease B, which are both serine proteases. Although the structures of proteases
A and B have been extensively studied, the genes encoding these proteins, and the manner in which this
genetic information is used to signal secretion by the cells, are not understood. According to this invention,
the genes which encode protease A and protease B and provide for the secretion of these proteins in
bioactive form have been discovered. It has been determined that each of protease A and B is included in a
precursor protein which is processed to remove an amino-terminal polypeptide portion from the mature
protease. It has further been determined that each of the protease A and B precursor proteins is enzymatically
processed to form correctly-positioned intramolecular disulphide bonds, which processing is concomitant
with removal of the amino-terminal addressing peptide from the mature precursor. The discovered genes,
which encode proteases A and B, their intermediate address-competent forms, and their control elements,
have been designated sprA and sprB.

As discussed in the following articles: Jurasek, L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and
L.H. Ericsson (1974), Amino acid sequencing of Streptomyces griseus protease B, a major component of
pronase, Biochem. Biophys. Res. Comm. 61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and
M.O. Dayhoff (1978), Serine proteases, in M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5,
suppl. 3:73-93, proteases A and B are homologous proteins containing several segments of identical amino
acid sequence. In accordance with this invention, the genetic code, which makes and directs the secretion
of each of proteases A and B, has identical DNA sequences corresponding to the regions of identity for
the homologous proteins proteases A and B. In order to isolate the genes, this assumption, that identity
in portions of the gene sequences would occur, was made so that an oligonucleotide probe could be
designed from one of the similar regions in the sequences.

In order to extrapolate the gene sequence which would encode the similar amino acid sequence, the
known codon bias for Streptomyces was relied upon to develop the nucleotide probe (see Bernan, V., D.
Filpula, W. Herber, M. Bibb, and E. Katz (1985), The nucleotide sequence of the tyrosinase gene from
Streptomyces antibioticus and characterization of the gene product, Gene 37:101-110; Bibb, M.J., M.J. Bibb,
J.M. Ward, S.N. Cohen (1985), Nucleotide sequences encoding and promoting expression of three antibiotic
resistance genes indigenous to Streptomyces, Mol. Gen. Genet. 199:26-36; Thompson, C.J., and G.S.
Gray (1983), Nucleotide sequence of a streptomycete aminoglycoside phosphotransferase gene and its
relationship to phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA 80:5190-
5194). Once the probe was constructed, it was then possible to probe the DNA sequences of S. griseus to
determine if there were any corresponding nucleic acid sequences in the microorganism. Since it was
known that there were two proteases, A and B, the oligonucleotide probe should have revealed two DNA
fragments detected by hybridization analysis, and in fact, not only did the probe hybridize equally to two
fragments generated in the genomic library of S. griseus, but also two fragments generated by BamHI
digest (8.4 kb and 6.8 kb) or BglII (11 kb and 2.8 kb) were isolated from the genomic library. As a cross-
check with respect to the predictability of such probe, the same fragments were detected in genomic DNA
libraries of other isolates of S. griseus. It was noted, however, that there was no such hybridisation of the
oligonucleotide probe with DNA from other Streptomyces such as S. lividans.
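
The probe-design step described above can be sketched in a few lines. This is only an illustration under
stated assumptions: the preferred-codon table below simply picks one codon ending in G or C for each amino
acid (consistent with the general GC-rich codon bias of Streptomyces) and is not taken from the cited papers,
and the example peptide is hypothetical, since the actual probe sequence is not given in this passage.

```python
# Illustrative preferred-codon table (assumption): one G/C-ending codon per amino acid.
# It is NOT the codon-usage data of the cited references.
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGC",
    "S": "TCC", "T": "ACC", "V": "GTC", "W": "TGG", "Y": "TAC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def design_probe(peptide):
    """Back-translate a peptide into a single 'best guess' coding-strand sequence."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())


def reverse_complement(dna):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def gc_fraction(dna):
    """Fraction of G and C bases in the sequence."""
    dna = dna.upper()
    return sum(base in "GC" for base in dna) / len(dna)


# Hypothetical conserved peptide segment, used only to exercise the functions.
example_peptide = "GDSGG"
coding_probe = design_probe(example_peptide)
antisense_probe = reverse_complement(coding_probe)
```

Choosing a single preferred codon per residue yields one unique oligonucleotide rather than a degenerate
mixture, which is the practical benefit of relying on a strong codon bias.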

Plasmids were constructed containing digested fragments of S. griseus. The oligonucleotide probe was
used to isolate developed plasmids containing sprA and sprB. The screening by use of the probe was
accomplished by colony blot hybridization where approximately 15,000 E. coli transformants containing the
developed plasmids were screened. Twelve transformants were detected by the probe and isolated for
further characterization. These colonies contained two distinct classes of plasmid based on restriction
analysis. As determined from the hybridization of genomic DNA, the plasmids contained either the 6.8 kb or
the 8.4 kb BamHI fragment. These fragments contained the sprA and sprB genes.

The fragments as isolated by hybridization screening were tested for the expression of proteolytic
activity. With these plasmids identified, such characterization may be accomplished in accordance with a
variety of known techniques in accordance with a preferred embodiment of this invention.

The 6.8 kb and 8.4 kb BamHI fragments were ligated into the BglII site of the vector pIJ702.
Transformants of S. lividans containing these constructions were tested on a milk plate for secretion of
proteases. A clear zone, which represented the degradation of the milk proteins, surrounded each
transformant that contained either BamHI fragment. It was noted that the clear zones were not found around
S. lividans colonies which contained either pIJ702 only or no plasmid construct.
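
Ligating BamHI fragments into a BglII site works because both enzymes leave the same 5'-GATC overhang.
The short sketch below illustrates this with the standard recognition sites (BamHI G^GATCC, BglII A^GATCT);
the function names are hypothetical helpers introduced only for this illustration.

```python
# Recognition site and top-strand cut offset (cut occurs after this many bases).
SITES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def five_prime_overhang(enzyme):
    """Single-stranded 5' overhang left by a palindromic site cut symmetrically."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def hybrid_junction(left_enzyme, right_enzyme):
    """Top-strand sequence formed when an end cut by left_enzyme is ligated
    to an end cut by right_enzyme (the overhangs must match)."""
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    overhang = five_prime_overhang(left_enzyme)
    if overhang != five_prime_overhang(right_enzyme):
        raise ValueError("incompatible cohesive ends")
    return left_site[:left_cut] + overhang + right_site[len(right_site) - right_cut:]


junction = hybrid_junction("BamHI", "BglII")
recuttable = junction in {site for site, _ in SITES.values()}
```

The hybrid junction matches neither recognition site, so an insert cloned this way can no longer be excised
with BamHI or BglII at that junction.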

Proteolytic activity was also observed when the BamHI fragments were cloned in either orientation with
respect to the vector, thereby minimizing the possibility of read-through transcription of an incomplete
protease gene. This observation provides evidence that the two BamHI fragments each contain an intact protease
gene which is capable of effecting secretion in a different Streptomyces species, for example S.
lividans. With this particularly relevant characterization of the BamHI fragments, and knowing that the desired
gene was in these fragments, it was possible to isolate and to sequence the genes encoding protease A
and protease B.

According to a preferred aspect of this invention, the particular protease gene contained within each
cloned BamHI fragment was determined by dideoxy sequencing of the plasmids using the oligonucleotide
probe as a primer in such analysis. The 8.4 kb BamHI fragment was found to contain sprB, because a
polypeptide deduced from the DNA sequence matched a unique segment of the known amino acid
sequence of protease B. The 6.8 kb BamHI fragment contained sprA by process of elimination. The
protease genes in these fragments were localized by digesting the plasmids and determining which of the
restriction fragments of the plasmids were capable of hybridizing to the oligonucleotide probe.

Figure 2 shows detailed restriction maps of the 6.8 kb and 8.4 kb BamHI fragments. Hybridization to the
oligonucleotide probe was confined to a 0.9 kb PvuII-StuI fragment of sprA and a 0.8 kb PvuII-PvuII fragment
of sprB. Such hybridization is indicated by the heavy lines in Figure 2. Hybridization to the cloned BamHI
fragments and the 2.8 kb BglII fragment of sprB agrees with the hybridization to BamHI and BglII fragments
of genomic DNA. Thus, rearrangement of the BamHI fragments containing the protease genes is unlikely to
have occurred.

The functional portions of the sprA- and sprB-containing DNA were determined by subcloning restriction
fragments thereof into pIJ702. The constructed plasmids were transformed into S. lividans and tested for
proteolytic activity. The 3.2 kb BamHI-BglII fragment of sprA and the 2.8 kb BglII fragment of sprB, when
subcloned into pIJ702 in either orientation, resulted in the secretion of protease from S. lividans. The intact
protease genes were further delimited to a 1.9 kb SalI fragment for sprA and a 1.4 kb BssHII fragment for
sprB. With reference to Figure 2, each of these functionally active subclones is indicated below the
restriction maps which contain the region for each gene which hybridized to the oligonucleotide probe.

In order to determine the nucleic acid sequence of the protease genes, the 3.2 kb BamHI-BglII fragment
of sprA and the 2.8 kb BglII fragment of sprB were subcloned into pUC18 to facilitate further structural
characterization. Figure 3 shows the restriction maps of these subclones and the strategies which
were used to sequence the 1.4 kb SalI fragment containing sprA and the 1.4 kb BssHII fragment containing
sprB. The resultant DNA sequences of sprA and sprB are shown in Figures 4 and 5,
respectively. The predicted amino acid sequence of protease A differed from the published sequence by
the amidation of amino acid 133, whereas that of protease B was identical to the published sequence (see
Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease
at 1.7 Å resolution; Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502).

Analyzing the sequences of Figures 4 and 5, it is apparent that each sequence contains a large open
reading frame with the coding region of the mature protease situated at the 3' end. For the protease A and
protease B genes, the sequence encoding the carboxy-terminus of the protease is followed immediately by
a translation stop codon. At the other end of the sequence, the predicted amino acid sequences appear to
extend beyond the amino-termini of the mature proteases A and B by an additional 116 amino acids for
sprA of Figure 4 and 114 amino acids for sprB of Figure 5. The putative GTG initiation codons at each of
these positions (-116 for Figure 4; -114 for Figure 5) are each preceded by a potential ribosome binding site
(as indicated by the series of five dots above the sequence) and followed by a sequence which encodes a
signal peptide. The processing site for the signal peptidase (identified by the light arrow in Figures 4 and 5)
is predicted at 38 amino acids from the amino-terminus of the putative precursor. [For clarity, that part of
the nucleic acid sequences of Figures 4 and 5 corresponding to the signal peptide portion of sprA and sprB
is reproduced in Figures 4A and 5A, respectively.] The propeptide is encoded by the remaining sequence
between the signal processing site (light arrow) and the start of the mature protein (indicated at the dark
arrow). [For clarity, that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the propeptide
portion of sprA and sprB is reproduced in Figures 4B and 5B, respectively.] The mature protease is
encoded by the codon sequence 1 through 181 for Figure 4 and 1 through 185 for Figure 5. [For clarity,
that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the mature protein portion of
sprA and sprB is reproduced in Figures 4C and 5C, respectively.] The amino acid sequence for codons
-116 through +181 of Figure 4 and the amino acid sequence for codons -114 through +185 of Figure 5,
when made in the living cell S. griseus, are acted upon in a manner to produce in the culture medium
externally of the living cells the mature bioactive enzymes protease A and protease B. The processing
involved, in accordance with the information encoded by that portion of the gene from the start of the
promoter to the start of the mature protein, in each case included providing a secretory address, the correct
signal peptide processing site, the necessary propeptide structure not only for secretion but also for correct
disulphide bond formation concomitant with secretion, and competent secretion in bioactive form.
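
The codon numbering used above can be made concrete with a small sketch. It assumes, as the stated
lengths imply, that the numbering has no codon 0: residues upstream of the mature protein run from
-N to -1 and the mature protein starts at +1. The function name and the returned dictionary keys are
hypothetical and introduced only for illustration.

```python
def precursor_layout(pre_pro_len, signal_len, mature_len):
    """Return inclusive (first, last) codon positions for each precursor segment.

    pre_pro_len: residues preceding the mature protein (signal + propeptide)
    signal_len:  residues removed by the signal peptidase
    mature_len:  residues in the mature protease
    """
    if not 0 < signal_len < pre_pro_len:
        raise ValueError("signal peptide must be shorter than the pre-pro region")
    first = -pre_pro_len
    signal = (first, first + signal_len - 1)
    propeptide = (signal[1] + 1, -1)
    mature = (1, mature_len)
    return {
        "signal": signal,
        "propeptide": propeptide,
        "mature": mature,
        "propeptide_len": pre_pro_len - signal_len,
        "precursor_len": pre_pro_len + mature_len,
    }


# Lengths taken from the text: 116/114 upstream residues, 38-residue signal peptide,
# 181/185-residue mature proteases.
spr_a = precursor_layout(pre_pro_len=116, signal_len=38, mature_len=181)
spr_b = precursor_layout(pre_pro_len=114, signal_len=38, mature_len=185)
```

Under these assumptions the sprA propeptide spans codons -78 to -1 and the sprB propeptide spans codons
-76 to -1.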

In accordance with this invention, the ability of the signal peptide to direct the secretion of bioactive
protein was established by inserting known DNA sequences at the beginning and at the end of known
sequences. For example, consider the sequence shown in Figure 5. In particular, the promoter and initiator
ATG of the aminoglycoside phosphotransferase gene (Thompson, C.J., and G.S. Gray (1983), Nucleotide
sequence of a streptomycete aminoglycoside phosphotransferase gene and its relationship to
phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA, 80:5190-5194) were inserted
preceding the second codon (AGG at -113) of the signal sequence of Figure 5. Due to the insertion of this
new promoter and initiator, the sprB gene, now under the control of this non-native promoter, directed both
elevated levels and earlier expression of proteolytic activity when compared with the unaltered sprB gene.
The secretion of bioactive protease B in this construction indicated that nucleic acid sequences preceding
the GTG initiation codon at -114 are not required for the correct secretion of protease B in bioactive
form, provided an active and competent promoter is placed in the precise location indicated.

In order further to demonstrate the universality of the discovered signal peptide, the sprB coding region
was replaced with a gene sequence encoding the mature amylase from S. griseus. Hence the nucleic acid
sequence encoding the amylase was inserted in place of the sequence of Figure 5 to the right of the light






EP 0 300 468 Bt 



arrow. It was determined that the resulting genetic construction directed the production of an extracellular
protein having an N-terminal alanine, properly positioned intramolecular disulphide bonds, and exhibiting
amylolytic activity at a level comparable to that of a similar construction with the natural signal peptide of
amylase. In accordance with this invention, the 38 amino acid signal peptide of Figures 4 and 4A and 5 and
5A is sufficient to direct the secretion of non-native protein in bioactive form.

Since both signal sequences encode for the signal peptides of Figures 4 and 4A and 5 and 5A, the
organization of the coding regions of sprA and sprB was investigated by comparing the amino acid
homology of the encoded peptide sequences. Such comparisons are set out in Figure 6 where amino acid
homology has been compared for the signal peptide of Figure 6a, the propeptide of Figure 6b and the
mature protease of Figure 6c. A summary of such homology is provided in the following Table 1.

TABLE 1

Homology of sprA and sprB Coding Regions

                        Length (codons)   Protein Homology %   DNA Homology %
Signal                         38                 50                  58
Propeptide                     79                 43                  62
NT protease (a)                87                 46                  58
CT protease (b)               103                 75                  75
Total protease                190                 61                  67
Total coding region           307                 55                  65

(a) amino-termini of mature proteases (amino acids 1-87)
(b) carboxy-termini of mature proteases (amino acids 88-190)
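
As a consistency check, the "total" rows of Table 1 can be compared with length-weighted averages of the
individual regions. The sketch below is only an arithmetic illustration; it assumes the totals were derived
(or can be approximated) this way, which the text does not state, and the function name is hypothetical.

```python
# Region -> (length in codons, protein homology %, DNA homology %), from Table 1.
REGIONS = {
    "signal": (38, 50, 58),
    "propeptide": (79, 43, 62),
    "nt_protease": (87, 46, 58),
    "ct_protease": (103, 75, 75),
}

PROTEIN, DNA = 1, 2  # column indices into the tuples above


def weighted_homology(names, column):
    """Return (total length, length-weighted mean homology %) over the named regions."""
    total_len = sum(REGIONS[name][0] for name in names)
    weighted_sum = sum(REGIONS[name][0] * REGIONS[name][column] for name in names)
    return total_len, weighted_sum / total_len


protease_len, protease_protein = weighted_homology(["nt_protease", "ct_protease"], PROTEIN)
_, protease_dna = weighted_homology(["nt_protease", "ct_protease"], DNA)
coding_len, coding_protein = weighted_homology(list(REGIONS), PROTEIN)
_, coding_dna = weighted_homology(list(REGIONS), DNA)
```

The weighted values come out at roughly 62% and 67% for the mature protease (190 codons) and roughly 55%
and 65% for the whole coding region (307 codons), which agrees with the tabulated totals to within about
one percentage point of rounding.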



The alignment of amino acid sequences translated from the coding regions of the sprA and sprB genes
indicates an overall homology of 54% on the basis of amino acid identity. As indicated in Table 1, the
sequence homology is not uniformly distributed throughout the coding region of the sprA and sprB genes.
The carboxy-terminal domains of the proteases A and B are 75% homologous as noted under the heading
"CT protease" whereas the average homology for the remainder of the coding region is only 45%, indicated
under the heading "NT protease". The amino-terminal domains containing the signal and propeptide
regions were similar in both extent of homology and distribution of consensus sequences, as indicated
under the headings "signal" and "propeptide". The unexpectedly high DNA sequence homology relative to
that of the protein sequences is particularly due to the 61% conservation in the third position of each codon
of the sequence. These investigations, revealing the close homology between the sprA and sprB genes,
suggest that both genes originated by duplication of a common ancestral gene. With appropriate care and
investigation, the commonality of the signal peptides can be determined, thus establishing the cue for
secretion of proteins and hence providing sufficient information to construct, from the signal DNA of sprA
and sprB, a single nucleic acid sequence which will be competent to direct protein secretion.
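
The two quantities discussed above, percent identity over an alignment and conservation at the third codon
position, can be computed as sketched below. This is a minimal illustration, not the procedure used to
produce Table 1: it assumes gapped positions are excluded from the denominator and that the two DNA
sequences are codon-aligned and ungapped. The toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned, non-gap positions at which the two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def third_position_identity(dna_a, dna_b):
    """Percent identity restricted to the third base of each codon."""
    if len(dna_a) != len(dna_b) or len(dna_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and of equal length")
    return percent_identity(dna_a[2::3], dna_b[2::3])


# Toy inputs (hypothetical): the codons differ only at the third base of the first codon.
toy_a = "GCCGTGAAGTCC"
toy_b = "GCGGTGAAGTCC"
toy_third = third_position_identity(toy_a, toy_b)
toy_overall = percent_identity(toy_a, toy_b)
```

Because most synonymous changes fall on the third base, a high third-position identity raises the DNA
homology without necessarily changing the protein homology.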

In accordance with the invention, a recombinant DNA sequence can be developed which encodes for a
desired protein where the expressed protein, in conjunction with the signal peptide and optionally the
propeptide, provides for secretion of the desired protein in bioactive form. The recombinant DNA sequence
may be inserted in a suitable vector for transforming a desired cell for manufacturing the protein. Suitable
expression vectors may include plasmids and viral phages. As is appreciated by those skilled in the art, the
bioactivity of secretory proteins is assured by establishing the correct configuration of intramolecular
disulphide bonds. Thus, suitable prokaryotic hosts may be selected for their ability to display enzymatic
activity of a type typified by, but not limited to, that of protein disulphide oxidoreductase, EC 5.3.4.1.

The particular protein encoded by the recombinant DNA sequence may include eukaryotic secretory
enzymes, such as prochymosin, chymotrypsins, trypsins, amylases, ligninases, chymosin, diastases, lipases,
and cellulases; prokaryotic secretory enzymes such as glucose isomerase, amylases, lipases, pectinases,
cellulases, proteinases, oxidases, ligninases; blood factors, such as Factor VIII and Factor IX and Factor VIII-
related biosynthetic blood coagulant proteins; tissue-type plasminogen activator; hormones, such as
proinsulin; lymphokines, such as beta and gamma-interferon, and interleukin-2; enzyme inhibitors, such as
extracellular proteins whose action is to destroy antibiotics either enzymatically or by binding, for example,
a β-lactamase inhibitor, α-trypsin inhibitor; growth factors, such as organism or nerve growth factors,
epidermal growth factors, tumor necrosis factors, colony stimulating factors; immunoglobulin-related
molecules, such as synthetic, designed, or engineered antibody molecules; cell receptors, such as cholesterol
receptor; viral molecules, such as viral hemagglutinins, AIDS antigen and immunogen, hepatitis B antigen
and immunogen, foot-and-mouth disease virus antigen and immunogen; bacterial surface effectors, such as
protein A; toxins such as protein insecticides, algicides, fungicides, and biocides; and systemic proteins of
medical importance, such as myocardial infarct protein (MIP), weight control factor (WCF), caloric rate
protein (CRP) and hirudin (HRD).

One skilled in the art can easily determine whether the use of any known or unknown organism will be
within the scope of this invention in accordance with the above discussion and the following examples.

Microorganisms which may be useful in this respect as potential prokaryotic expression hosts include:
Order: Actinomycetales; Family Actinomycetaceae: Genus Matruchonema, Lactoghera; Family Actinobacteria:
Genus Actinomyces, Agromyces, Arachnia, Arcanobacterium, Arthrobacter, Brevibacterium, Cellulomonas,
Curtobacterium, Microbacterium, Oerskovia, Promicromonospora, Renibacterium, Rothia; Family
Actinoplanetes: Genus Actinoplanes, Dactylosporangium, Micromonospora; Family Nocardioform
actinomycetes: Genus Caseobacter, Corynebacterium, Mycobacterium, Nocardia, Rhodococcus; Family
Streptomycetes: Genus Streptomyces, Streptoverticillium; Family Maduromycetes: Genus Actinomadura,
Excellospora, Microspora, Planospora, Spirillospora, Streptosporangium; Family Thermospora: Genus
Actinosynnema, Nocardiopsis, Thermophilla; Family Microspora: Genus Actinospora, Saccharospora; Family
Thermoactinomycetes: Genus Thermoactinomyces; and the other prokaryotic genera: Acetivibrio, Acetobacter,
Achromobacter, Acinetobacter, Aeromonas, Bacterionema, Bifidobacterium, Flavobacterium, Kurthia,
Lactobacillus, Leuconostoc, Mycobacteria, Propionibacterium.

The following species from the genus Streptomyces are identified as particularly suitable as hosts:
acidophilus, albus, amylolyticus, argenteolus, aureofaciens, aureus, candidus, cellostaticus, cellulolyticus,
coelicolor, creamorus, diastaticus, farinosus, flaveolus, flavogriseus, fradiae, fulvoviridis, fungicidicus,
gelaticus, glaucescens, globisporus, griseolus, griseus, hygroscopicus, ligninolyticus, lipolyticus, lividans,
moderatus, olivochromogenus, parvus, phaeochromogenes, plicatus, proteolyticus, rectus, roseolus,
roseoviolaceus, scabies, thermolyticus, tumorstaticus, venezuelae, violaceus, violaceus-ruber, violascens,
and viridochromogenes.

acrimycini
alboniger
ambofaciens
antibioticus
aspergilloides
chartreusis
clavuligerus
diastatochromogenes
echinatus
erythraeus
fendae
griseofuscus
kanamyceticus
kasugaensis
koganeiensis
lavendulae
parvulus
peucetius
reticuli
rimosus
vinaceus

Also, the following eukaryotic hosts are potentially useful in the practice of this invention:
Absidia, Acremonium, Acrophialophora, Acrospeira, Alternaria, Arthrobotrys, Ascotricha, Aureobasidium,
Beauveria, Bispora, Bjerkandera, Calocera, Candida, Cephaliophora, Cephalosporium, Cerinomyces,
Chaetomium, Chrysosporium, Circinella, Cladosporium, Ciioroastix, Coccospora, Cochliobolus,
Cunninghamella, Curvularia, Custingophora, Dacrymyces, Dacryopinax, Dendryphion, Dictyosporium, Doratomyces,
Drechslera, Eupenicillium, Flammulina, Fusarium, Gliocladium, Gliomastix, Graphium, Hansenula,
Humicola, Hyalodendron, Isaria, Kloeckera, Kluyveromyces, Lipomyces, Mammaria, Merulius, Microascus,
Monodictys, Monosporium, Morchella, Mortierella, Mucor, Myceliophthora, Myrothecium, Neurospora,
Oedocephalum, Oidiodendron, Pachysolen, Papularia, Papulaspora, Penicillium, Peniophora, Periconia,
Phaeocoriolellus, Phanerochaete, Phialophora, Piptocephalis, Pleurotus, Preussia, Pycnoporus,
Rhinocladiella, Rhizomucor, Rhizopus, Rhodotorula, Robillarda, Saccharomyces, Schwanniomyces,
Scolecobasidium, Scopulariopsis, Scytalidium, Stachybotrys, Tetracladium, Thamnidium, Thermoascus,
Thermomyces, Thielavia, Tolypocladium, Torula, Torulopsis, Trametes, Tricellula, Trichocladium,
Trichoderma, Trichurus, Truncatella, Ulocladium, Ustilago, Verticillium, Wardomyces, Xylogone, Yarrowia.

Preferred embodiments of the invention are exemplified in the following procedures. Such procedures
and results are by way of example and are not intended in any way to limit the scope of the claims.

PREPARATIONS 

Strains and Plasmids

Streptomyces griseus (ATCC 15395) was obtained from the American Type Culture Collection.
Streptomyces lividans 66 (Bibb, M.J., J.L. Schottel, and S.N. Cohen (1980), A DNA cloning system for
interspecies gene transfer in antibiotic-producing Streptomyces, Nature 284:528-531) and the plasmids pIJ61
and pIJ702 were obtained from the John Innes Institute (Thompson, C.J., T. Kieser, J.M. Ward, and D.A.
Hopwood (1982), Physical analysis of antibiotic-resistance genes from Streptomyces and their use in vector
construction, Gene 20:51-62; Katz, E., C.J. Thompson, and D.A. Hopwood (1983), Cloning and expression of the
tyrosinase gene from Streptomyces antibioticus in Streptomyces lividans, J. Gen. Microbiol., 129:2703-
2714). E. coli strain HB101 (ATCC 33694) was used for all transformations. Plasmids pUC8, pUC18 and
pUC19 were purchased from Bethesda Research Laboratories.

Growth of Streptomyces mycelium for the isolation of DNA or the preparation of protoplasts was as
described in Hopwood, D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P.
Smith, J.M. Ward, and H. Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual,
The John Innes Foundation, Norwich, UK. Protoplasts of S. lividans were prepared by lysozyme treatment,
transformed with plasmid DNA, and selected for resistance to thiostrepton, as described in the same manual.
Transformants were screened for proteolytic or amylolytic activity on LB plates containing 30
µg/ml thiostrepton, and either 1% skim milk or 1% corn starch, respectively. E. coli transformants were
grown on YT medium containing 50 µg/ml ampicillin.

Materials

Oligonucleotides were synthesized using an Applied Biosystems 380A DNA synthesizer. Columns,
phosphoramidites, and reagents used for oligonucleotide synthesis were obtained from Applied Biosystems,
Inc. through Technical Marketing Associates. Oligonucleotides were purified by polyacrylamide gel
electrophoresis followed by DEAE cellulose chromatography. Enzymes for digesting and modifying DNA were
purchased from New England Biolabs and used according to the supplier's recommendations.
Radioisotopes [α-32P]dATP (~3000 Ci/mmol) and [γ-32P]ATP (~3000 Ci/mmol) were from Amersham.
Thiostrepton was donated by Squibb.

EXAMPLE 1 - Isolation of DNA

Chromosomal DNA was isolated from Streptomyces as described in Chater, K.F., D.A. Hopwood, T.
Kieser, and C.J. Thompson (1982), Gene cloning in Streptomyces, Curr. Topics Microbiol. Immunol., 96:69-95,
except that sodium dodecyl sarcosinate (final conc. 0.5%) was substituted for sodium dodecyl sulfate.
Plasmid DNA of transformed S. lividans was prepared by an alkaline lysis procedure as set out in Hopwood,
D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H.
Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation,
Norwich, UK. Plasmid DNA from E. coli was purified by a rapid boiling method (Holmes, D.S., and M.
Quigley (1981), A rapid boiling method for the preparation of bacterial plasmids, Anal. Biochem., 114:193-197).
DNA fragments and vectors used for all constructions were separated by electrophoresis on low
melting point agarose, and purified from the molten agarose by phenol extraction.

EXAMPLE 2 - Construction of Genomic Library

Chromosomal DNA of S. griseus ATCC 15395 was digested to completion with BamHI and fractionated
by electrophoresis on a 0.8% low melting point agarose gel. DNA fragments ranging in size from 4 to 12
kilobase pairs (kb) were isolated from the agarose gel. The plasmid vectors pUC18 and pUC19 were
digested with BamHI, and treated with calf intestinal alkaline phosphatase (Boehringer Mannheim). The S.
griseus BamHI fragments (0.3 µg) and vectors (0.8 µg) were ligated in a final volume of 20 µl as described
in Maniatis, T., E.F. Fritsch, and J. Sambrook (1982), Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY. Approximately 8000 transformants of HB101 were obtained
from each ligation reaction.
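
As a rough check on the ligation quantities given in Example 2, the molar amounts of vector and insert can be estimated from the stated masses. The sketch below is illustrative only. It assumes an average mass of about 650 g/mol per base pair, a vector length of 2686 bp (the published size of pUC18 and pUC19), and an insert length of 8 kb taken as the midpoint of the 4 to 12 kb size fraction; none of these three figures is stated in the procedure itself.

```python
# Rough estimate of DNA molar amounts in the ligation (illustrative assumptions only).
AVG_BP_MASS = 650.0   # g/mol per base pair; common approximation, not from the text
VECTOR_BP = 2686      # pUC18/pUC19 length in bp; assumed from the published sequences
INSERT_BP = 8000      # assumed midpoint of the 4-12 kb size fraction


def picomoles(mass_ug: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (micrograms) to picomoles of molecules."""
    grams = mass_ug * 1e-6
    moles = grams / (length_bp * AVG_BP_MASS)
    return moles * 1e12


vector_pmol = picomoles(0.8, VECTOR_BP)   # 0.8 ug vector, as stated in Example 2
insert_pmol = picomoles(0.3, INSERT_BP)   # 0.3 ug BamHI fragments, as stated in Example 2
ratio = vector_pmol / insert_pmol

print(f"vector: {vector_pmol:.3f} pmol, insert: {insert_pmol:.3f} pmol, "
      f"vector:insert = {ratio:.1f}:1")
```

Under these assumptions the vector is in molar excess over the insert; the ratio scales linearly with the assumed insert length, so it is lower for 4 kb fragments and higher for 12 kb fragments.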

EXAMPLE 3 - Subcloning of Protease Gene Fragments

A hybrid Streptomyces-E. coli vector was constructed by ligating pIJ702, which had been linearized by
BamHI, into the BamHI site of pUC8. The unique BglII site of this vector was used for subcloning BamHI
and BglII fragments of the protease genes. Other fragments were adapted with BamHI linkers to facilitate
ligation into the BglII site. The hybrid vector, with pUC8 inserted at the BamHI site of pIJ702, was incapable
of replicating in Streptomyces. However, the E. coli plasmid could be readily removed prior to transforming S.
lividans by digestion with BamHI followed by recircularization with T4 ligase.
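
The subcloning strategy of Example 3 relies on BamHI and BglII leaving identical cohesive ends, so that fragments cut with either enzyme can be ligated into the BglII site. A minimal sketch of that point follows, using the standard recognition sequences and cut positions of these enzymes (and of Sau3AI, which is used in Example 4). The helper name `five_prime_overhang` is hypothetical and is introduced only for illustration.

```python
# Recognition site (5'->3', top strand) and number of bases left of the top-strand cut.
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "BglII": ("AGATCT", 1),   # A^GATCT
    "Sau3AI": ("GATC", 0),    # ^GATC
}


def five_prime_overhang(site: str, cut: int) -> str:
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    return site[cut:len(site) - cut]


overhangs = {name: five_prime_overhang(site, cut)
             for name, (site, cut) in ENZYMES.items()}
print(overhangs)

# Joining a BglII-cut end (left) to a BamHI-cut end (right) gives a hybrid junction.
bgl_site, bgl_cut = ENZYMES["BglII"]
bam_site, bam_cut = ENZYMES["BamHI"]
hybrid = bgl_site[:len(bgl_site) - bgl_cut] + bam_site[len(bam_site) - bam_cut:]
print(hybrid)
```

All three enzymes leave the same four-base 5' overhang. The hybrid junction formed between a BglII end and a BamHI end is no longer a site for either enzyme, although it still contains the Sau3AI sequence.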

EXAMPLE 4 - Construction for Testing the sprB Signal Peptide

The 0.4 kb Sau3AI-NcoI fragment containing the aminoglycoside phosphotransferase gene promoter
was isolated from pIJ61 and subcloned into the BamHI and NcoI sites of a suitable vector. The NcoI site
containing the initiator ATG was joined to the MluI site of the sprB signal using two 43-mer oligonucleotides,
which reconstructed the amino-terminus of the signal peptide. An amylase gene of S. griseus was adapted
by ligating a 14-mer PstI linker to a SmaI site in the third codon. This removed the signal peptide and
restored the amino-terminus of the mature amylase. The HaeII site of the sprB signal was joined to the PstI
site of the amylase subclone using two 28-mer oligonucleotides, which reconstructed the carboxy-terminus
of the signal peptide.

EXAMPLE 5 - Hybridization

A 20-mer (5'-TTCCC(C/G)AACAACGACTACGG-3') oligonucleotide was designed from an amino acid
sequence (FPNNDYG) which was common to both proteases. For use as a hybridization probe, the
oligonucleotide was end-labelled using T4 polynucleotide kinase (New England Biolabs) and [γ-32P]ATP.
Digested genomic or plasmid DNA was transferred to a Hybond-N nylon membrane (Amersham) by
electroblotting and hybridized in the presence of formamide (50%) as described in Hopwood, D.A., M.J.
Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H. Schrempf
(1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation, Norwich,
UK. The filters were hybridized with the labelled oligonucleotide probe at 30 °C for 18 h, and washed at
47 °C. The S. griseus genomic library was screened by colony hybridization as described in Wallace, R.B.,
M.J. Johnson, T. Hirose, T. Miyake, E.H. Kawashima, and K. Itakura (1981), The use of synthetic
oligonucleotides as hybridization probes. II. Hybridization of oligonucleotides of mixed sequence to rabbit
β-globin DNA, Nucl. Acids Res., 9:879-894.
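
The relationship between the mixed 20-mer probe of Example 5 and the peptide FPNNDYG can be checked with a short script. The sketch below expands the single mixed C/G position (written here with the ambiguity letter S), translates each variant with the standard genetic code, and gives a crude melting temperature estimate. The estimate uses the rule of thumb of 2 °C per A/T and 4 °C per G/C for short oligonucleotides; that rule is an assumption introduced here, is not taken from the procedure, and does not account for the 50% formamide used in the hybridization. All function names are hypothetical.

```python
from itertools import product

PROBE = "TTCCCSAACAACGACTACGG"   # S stands for the mixed C/G position
AMBIGUITY = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG"}

# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
GENETIC_CODE = dict(zip(CODONS, AMINO))


def expand(probe: str) -> list:
    """List every unambiguous sequence represented by a mixed oligonucleotide."""
    return ["".join(p) for p in product(*(AMBIGUITY[b] for b in probe))]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, usable, 3))


def rule_of_thumb_tm(seq: str) -> int:
    """Crude Tm estimate in degrees C: 2 per A/T plus 4 per G/C (assumed rule)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


variants = expand(PROBE)
for v in variants:
    leftover = v[len(v) - len(v) % 3:]
    print(v, translate(v), "+", leftover, "Tm ~", rule_of_thumb_tm(v))
```

Both variants encode FPNNDY in full, and the final two bases (GG) are the first two positions of a glycine codon, which is why the probe stops short of the degenerate third position of that codon.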

EXAMPLE 6 - DNA Sequencing

The sequences of sprA and sprB were determined using a combination of the chemical cleavage
sequencing method (Maxam, A., and W. Gilbert (1977), A new method for sequencing DNA, Proc. Natl.
Acad. Sci. U.S.A., 74:560-564) and the dideoxy sequencing method (Sanger, F., S. Nicklen, and A.R.
Coulson (1977), DNA sequencing with chain-terminating inhibitors, Proc. Natl. Acad. Sci. U.S.A.,
74:5463-5467). Restriction fragments were end-labeled using either polynucleotide kinase or the large
fragment of DNA Polymerase I (Amersham), with the appropriate radiolabeled nucleoside triphosphate.
Labeled fragments were either digested with a second restriction endonuclease or strand-separated,
followed by electroelution from a polyacrylamide gel. Subclones were prepared in the M13 bacteriophage
and the dideoxy sequencing reactions were run using the -20 universal primer (New England Biolabs). In
some areas of strong secondary structure, compressions and polymerase failure necessitated the use of
either inosine (Mills, D.R., and F.R. Kramer (1979), Structure independent nucleotide sequence analysis,
Proc. Natl. Acad. Sci. U.S.A., 76:2232-2235) or 7-deazaguanosine (Mizusawa, S., S. Nishimura, and F. Seela
(1986), Improvement of the dideoxy chain termination method of DNA sequencing by use of deoxy-7-
deazaguanosine triphosphate in place of dGTP, Nucleic Acids Res., 14:1319-1324) analogs in the dideoxy
reactions to clarify the sequence. The sequences were compiled using the software of DNASTAR™
(Doggette, P.E., and F.R. Blattner (1986), Personal access to sequence databases on personal computers,
Nucleic Acids Res., 14:611-619).

Claims

Claims for the following Contracting States: AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. The DNA signal sequence of Fig. 4A.

2. The DNA signal sequence of Fig. 5A.

3. A vector comprising the signal sequence defined in claim 1 or claim 2, and also a sequence encoding a
desired protein fused thereto.

4. The vector of claim 3, which is a plasmid or phage.

5. A transformed prokaryotic cell comprising the vector of claim 3 or claim 4, which is capable of
expressing said sequences, as a fusion protein.

6. The cell of claim 5, which is of the genus Streptomyces.

7. The cell of claim 6, which is S. lividans or S. griseus.

8. A method for preparing a desired protein, which comprises culturing the cell of any of claims 5 to 7 in a
nutrient medium, the fusion protein being produced as an intermediate and the signal sequence
directing secretion of the desired protein from the cell.

Claims for the following Contracting State: ES

1. A process for preparing a DNA vector, comprising introducing the signal sequence of Fig. 4A or Fig. 5A
and also a sequence encoding a desired protein fused thereto.

2. The process of claim 1, wherein the vector is a plasmid or phage.

3. A process for preparing a transformed prokaryotic cell, comprising transformation with the vector of
claim 1 or claim 2, whereby the cell is capable of expressing said sequences, as a fusion protein.

4. The process of claim 3, wherein the cell is of the genus Streptomyces.

5. The process of claim 4, wherein the cell is S. lividans or S. griseus.

6. A method for preparing a desired protein, which comprises culturing the cell of any of claims 3 to 5 in a
nutrient medium, the fusion protein being produced as an intermediate and the signal sequence
directing secretion of the desired protein from the cell.

Patentansprüche (German text, translated)

Claims for the following Contracting States: AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. DNA signal sequence of Fig. 4A.

2. DNA signal sequence of Fig. 5A.

3. Vector which comprises the signal sequence defined in claim 1 or claim 2 and a sequence linked
thereto which codes for a desired protein.

4. Vector according to claim 3, which is a plasmid or phage.

5. Transformed prokaryotic cell which comprises the vector according to claim 3 or claim 4 and which is
capable of expressing said sequences as a fusion protein.

6. Cell according to claim 5, of the genus Streptomyces.

7. Cell according to claim 6, which is S. lividans or S. griseus.

8. Method for producing a desired protein, which comprises culturing the cell according to any one of
claims 5 to 7 in a nutrient medium, the fusion protein being produced as an intermediate product
and the signal sequence directing the secretion of the desired protein from the cell.

Claims for the following Contracting State: ES

1. Method for producing a DNA vector, which comprises introducing the signal sequence of Fig. 4A
or Fig. 5A and a sequence linked thereto which codes for a desired protein.

2. Method according to claim 1, wherein the vector is a plasmid or phage.

3. Method for producing a transformed prokaryotic cell, comprising transformation
with the vector according to claim 1 or claim 2, whereby the cell is capable of expressing said
sequences as a fusion protein.

4. Method according to claim 3, wherein the cell is of the genus Streptomyces.

5. Method according to claim 4, wherein the cell is S. lividans or S. griseus.

6. Method for producing a desired protein, which comprises culturing the cell according to any one of
claims 3 to 5 in a nutrient medium, the fusion protein being produced as an intermediate product
and the signal sequence directing the secretion of the desired protein from the cell.

Revendications (French text, translated)

Claims for the following Contracting States: AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. Signal DNA sequence of Figure 4A.

2. Signal DNA sequence of Figure 5A.

3. Vector comprising the signal sequence defined in claim 1 or claim 2 and also,
fused to this vector, a sequence coding for a desired protein.

4. Vector according to claim 3, which is a plasmid or a phage.

5. Transformed prokaryotic cell comprising the vector according to claim 3 or claim 4,
capable of expressing said sequences in the form of a fusion protein.

6. Cell according to claim 5, of the genus Streptomyces.

7. Cell according to claim 6, which is S. lividans or S. griseus.

8. Process for preparing a sought protein, which comprises culturing the cell according to
any one of claims 5 to 7 in a nutrient medium, the fusion protein being produced
as an intermediate, and the signal sequence directing the secretion of the sought protein out of
the cell.

Claims for the following Contracting State: ES

1. Process for preparing a DNA vector, comprising introducing the signal sequence of Figure
4A or of Figure 5A and also, fused to this vector, a sequence coding for a desired protein.

2. Process according to claim 1, in which the vector is a plasmid or a phage.

3. Process for preparing a transformed prokaryotic cell, comprising transformation with the
vector according to claim 1 or claim 2, whereby the cell is capable of expressing
said sequences in the form of a fusion protein.

4. Process according to claim 3, in which the cell is of the genus Streptomyces.

5. Process according to claim 4, in which the cell is S. lividans or S. griseus.

6. Process for preparing a sought protein, which comprises culturing the cell according to
any one of claims 3 to 5 in a nutrient medium, the fusion protein being produced
as an intermediate, and the signal sequence directing the secretion of the sought protein out of
the cell.

FIG.3.

A

3TC6ACCCCCATCTCATTCCG(^TCGCGQa:6CSAATCCfiGCCTTa;GTCAGfi6AC65TCCCCSTCAACSATTC 

CAGCQTGCAACTTGGCASGTTCACGCCCACTCCCACTSfiGTGACAACCTCGCGCfiCCAACGGCCCCACCTCACCC 

-H6 

MTFKRFSPLSSTSR

iiACCG6GCCGTCCCCCCATACCTCGGA6GATCTCCT8ACCTTCAA6CQCTTCTCGCCGCTCA6CAGCACGTCAAG 
-100 -80 i 

YARLLAVASGLVAAAALATPSAVAA
ATATeCACGGCTCCTCGCCGTGGCCrCCGGCCTG5TS6CCGCC6CG6CCCTGGCCACCCCCTCSGCCGTCGCCGC 

~60 

PEAESKATVSQLADASSAILAADVA
TCCCGAGGCGGAGTCCAAGGCCACCGTTTCGCAGCTCGCCGACGCCAGCTCCGCCATCCTCGCCGCTGATGTGGC 

-40 

GTAWYTEASTGKIVLTADSTVSKAE
G6GCACCGCCT6GTACACGGAGGCGAGCACGGGCAAGATCGTCCTCACCGCCGACAGCACCST6TCGAAGSCCGA 
-20 

LAKVSNALAGSKAKLTVKRAEGKFT
ACTGGCCAAGGTCAGCAAC3CGCTGGCGGGCTCCAAGGCGAAACTGACG6TCAAGCGCGCCGAG6G€AAGTTCAC 

20 

PLIAGGEAITTGGSRCSLGFNVSVN
CCCGCTGATCGC GGGCG6C 6A6GC C A TC ACC ACCGGTGGCAGCCGC TGTTC GCTCGGC TTCAAC GTGTC GGTCAA 

40 

GVAHALTAGHCTNISASWSIGTRTG
C6GCGTCGCCCAC6CGCTCACCGCCGGCCACTGCACCAACATCAGCGCCAGCTGGTCCATCGGCAC8C6CACCGG 

60 

TSFPNNDYGIIRHSNPAAADGRVYL
AACCAGCTTCCCGAACAAC6ACTACGGCATCATCCGCCACTCGAACCCGGCGGCGGCCGACGGCCGG6ICTACCT 
80 

YNGSYQDITTAGNAFVGQAVQRSGS
6T AC AAC GGCTCCTACC A GGAC ATC AC GAC GGCGGSCAAC8CCTTT6T66GGC ASGCC GTCC AGC 6CA6CGGC AG 
500 12 Q 
TTGLRSGSVTGLNATVNYGSSGIVY
CACCACCGGGCTGCGCAGCGGCTCGGTCACCGGCCTCAACGCCACGGTCAACTACGGTTCCAGCGGGATCGTGTA 

540 

GMIQTNVCAEPGDSGGSLFAGSTAL
CGGCATGATCCAGACCAACGTCTGTGCCGAGCCCGGTSACASTeSASfiCTCGCTCTTCGCGGGCAGCACCGCTCT 

160 

GLTSGGSGNCRTGGTTFYQPVTEAL
GGGTCTCACCTCCGGCGGCAGrGGCAACTGCCGGACCGXGGCACCACGTTCTACCAGCCCGTCACCGAGGCGCT 
181 

SAYGATVL*

GAGCGCCTAC GSGGC AAC G6TCCTGT AGCC GGTGCCACCGGGGCTTC 6GGCTGACC GCC GACC GGC CGCCCGAAS 

CCCCGCGCGACSCCCCACCCCGGCGGACCGTGCTCGCGCGCGGTCCGCCCTCGCCSTGCCACGAACCCCACCGTC 

CTTTCCCCGTCAGGCGCGTSCCffiTCGACCCGCATCGCGMGnGCCGAGAGTGGCCSGCTCGCACCGSCACTGC 

TGMGTCCTGCCCTCGCCCCACGGTCCGGTTCSCGCCCfiCCCGGACGCGGACCCGCGCCTGGGfiAAGCCCICACT 

C AACCCC GT T SCSCGC 6GATGAGG TC GC GA TACCAGGCSAAGGASSCCTTC GG66TQCGGACC TSTGTC TCGT6G 

TCGAC 

FIG.4.

FIG.4B.




PEAESKATVSQLADASSAILAADVA
TCCCGAGGCGGftGTCCAAGGCCACCGTTTCGCASSCTCSCCGACGCCAGCTCCGCCATCCTCfiCCGCTGATGTGGC 

-40 

GTAWYTEASTGKIVLTADSTVSKAE
G6GCACCGCCTG6TACACGGAGGCGAGCACGGGCAAGATCGTCCTCACCGCCGACAGCACCGTGTCGAA&GCCGA 
-20 

LAKVSNALAGSKAKLTVKRAEGKFT

FIG.4C.




GVAHALTAGHCTNISASWSIGTRTG
CGBC6TCGCCCACtt«TCACCGCC6eCCACT^^ 



60 

TSFPNNDYGIIRHSNPAAADGRVYL
AACCAGCTTCCCGAACAACGACTAC6GCATCATCCGCCACTCGAACCCGGCGGCGGCCGACGGCCGS6TCTACCT 
SO 

YNGSYQDITTAGNAFVGQAVQRSGS
GTACAAC6GCTCCTACCA66ACATCAC6ACGGCG6GCAACGCCTTTGTGGGGCAGGCCGTCCAGCGCAGCG6CA6 
100 520 
TTGLRSGSVTGLNATVNYGSSGIVY
CACCACCGGGCTGC6CAGCGGCTCG6TCACCGGCCTCAACGCCACGGTCAACTACG6TTCCAGC6GGATC6T6TA 

540 

GMIQTNVCAEPGDSGGSLFAGSTAL
CGGCATGATCCAGACCAACGTCTGTGCCGAGCCCGGTGACAGTGGAG6CTCGCTCTTCGCG6GCAGCACCGCTCT 

ISO 

GLTSGGSGNCRTGGTTFYQPVTEAL
GGGTCTCACCTCCGGCGGCAGTGGCAACTGCCGGACCGGCGGCACCACGTTCTACCAGCCCGTCACCGAGGCGCT

B

CGCTGTGCCCGCCSTGCSCCTTCfiCCfiATCACTTCATCTGCCCSTTCCCGCCCCCGGGCAACACGCTCGCCGCGG 
CGGTTTTGGCGSGQQ^GCGGAACCGGATCGACGCCTGACCCSCGCGASGCCCCACCQGCCCCSGCAGCCGCACGG 



CTCCCGGQtKCGGTGACGGATGTGACCCGCGTGGCCGAAAGSCATICTTGCGTCCCCCGTCCGGCCCCCTCSATA 

CTCCGGtCAGCGATTGTCASGGGCACGGCGAATTCGAAATCCGGACAGSCCCCCGACTGCSCCTCACSGGCCCGC 

-IHv 

MRIKRTSNRSN

CACCCCACAGG^GGGCCCCCGATTCCCCTCGGAGGAACCCGAAGTSASaATCAAGCGCACCAGCAACCGCTCGAA 

AARRVRTTAVLAGLAAVAALAVPTA
CSCGGCGAGAC6CSTCC6CACCACC6CCGTACTCGCGGG6CTCSCCSCC8TCGCGSCGCTGGCCGTTCCCACCGC 

NAETPRTFSANQLTAASDAVLGADI
6AACGCCSAftACCCCCG6GAC6TTCAGTGCCAACCA6CT6ACC8C6fiCGAGCGACGCCGTGCTCGGCGCCGACAT 

AGTAWNIDPQSKRLVVTVDSTVSKA
CGCGGGCACCGCCTGGAACATCGACCCGCAGTCCAAfiCCCCTCfiTCGTCACCGTCGACAGCACGGTCTCGAAGGC 

EINQIKKSAGANADALRIERTPGKF
GGASATCAACCA6ATCMGAAGTCGGCGGGCKCAACGCC^CGC6CTG€G6ATC5AGCGCACCCCC56GAAGTT 

TKLISGGDAIYSSTGRCSLGFNVRS
CACCAA5CTSATCTCCGGCGGCGACGCGATCTAC TCCAGCACCGGACfiCTGCTCGCTCGGCTTCAACGTCCGCAG 

GSTYYFLTAGHCTDGATTWWANSAR
CGGCA6CACCTACTACTTCCTGACCGCCGGCCACTGCAC8GACGGC6CGACCACCTGG1GGGCGAACTCGGCCCG 

TTVLGTTSGSSFPNNDYGIVRYTNT
CACCAC66TGCTCGGCAC6ACCTCCGGGTCGAGCTTCCCGAACAACGACTACGGCATCGT5CGCTACACCAACAC 

so 

TIPKDGTVGGQDITSAANATVGMAV
CACCATTCCCMGGACGGCACGGTCGGCGGCCAGGACATCACCAGCGCCSCCAACuCCACCGTCGGCATGGCGGT 

100 

TRRGSTTGTHSGSVTALNATVNYGG
C ACCC GCC GCG6CTCC ACCACCG6C ACCC AC AGCGGTTCG6TCACCGC ACTC AAC3C C AC CGTC AAC T AC GGSGG 

GDVVYGMIRTNVCAEPGDSGGPLYS
CG6CGAC6TCGTCTACGGCATGATCCSCACCAACGTGTGCGCGGAGCCCGGCGACTCCGGCGGCCCGCTCTACTC 

lfcO 

GTRAIGLTSGGSGNCSSGGTTFFQP
CGGCACCCGGGCSATCGGTCTGACCTCCGGCGSCAGCSGCAACTGCTCCTCCGGCGGCACGACCTTCTTCCA6CC 

VTEALSAYGVSVY* 
GGTCACCGAGSCGCTGAGCGCGTACGGCGTCAGCSTSTACTGACCGGCCCCGCCCCGSTCGGGTACGWGCAGTC 



CGTACAMCGTGCCCCCGTCCG6MTTCCGGACSGG6GCTCGCGCTCGCC6GGMGCTCTTGAGAGGATGTCGCC 
ACGACGGGTCGCCGCTGCGC6TC 



FIG.5.

FIG.5A

FIG.5B.




iCCCCCCGGACGTTCAGT6CCAACCAGCTGACCGCGGCGAGCSACGCC6T6CTCGGCGCCGACAT 



AGTAWNIDPQSKRLVVTVDSTVSKA
CGCGGGCACCGCCTG6AACArCGACCCGCAGTCCAAGC6CCTC6TC6TCACC6TC6ACA6CACG6TCTC6AA6GC 



-26

FIG.5C.




GSTYYFLTAGHCTDGATTWWANSAR
CGGCAGCACCTACTACTTCCTGACCGCCGGCCACTGCACGGACGGCGCGACCACCTG6TGGGCGAACTCGGCCCG 



TTVLGTTSGSSFPNNDYGIVRYTNT
CACCACG6TGCTCGGCACGACCTCCGGGTC5AGCtTCCC6AACMCGACTACSGCATC6TGC6CTACACCAACAC 

TIPKDGTVGGQDITSAANATVGMAV
CACCATTCCCAAGGACGGCACGGTCG6CGGCCAGGACATCACCAGCSCCGCCMC6CCACCGTCGGCATGGC6GT 
ioo xzo 

TRRGSTTGTHSGSVTALNATVNYGG
CACCCGCCGCG6CTCCAGCACCG6CACCCACAGCGGTTCG6TCACCGCACTCAACGCCACCGICAACTACGGGGG 

GDVVYGMIRTNVCAEPGDSGGPLYS
CGGCGACGTCGTCTACGGCATGATCCGCACCAACGTGTGCGCGGAGCCCGGCGACTCCGGCG6CCCGCTCTACTC 

iho 

GTRAIGLTSGGSGNCSSGGTTFFQP
CGGCACCCGGGC(^rC(^rCTGACCTCCGGCG6CAGC6aMCT<^TCCTCCGGCGGCACGACCTTCTTC€AG

FIG.6A

A

SprA  MTFKRFSPLSSTSRYARLLAVASGLVAAAALATPSAVA
      M  KR S  S   R  R  AV  GL A AALA P A A
SprB  MRIKRTSNRSNAARRVRTTAVLAGLAAVAALAVPTANA
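
The middle row of panel A marks the positions at which the two signal peptides carry the same residue. A minimal sketch of how that row and the corresponding percent identity can be recomputed is given below. The two sequences are transcribed from the SprA and SprB rows of panel A and should be checked against the original drawing; the function name `identity_row` is hypothetical.

```python
# Signal peptide sequences as transcribed from panel A (no gaps in this panel).
SPRA_SIGNAL = "MTFKRFSPLSSTSRYARLLAVASGLVAAAALATPSAVA"
SPRB_SIGNAL = "MRIKRTSNRSNAARRVRTTAVLAGLAAVAALAVPTANA"


def identity_row(a: str, b: str) -> str:
    """Return a row showing the shared residue at each position, or a space."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(x if x == y else " " for x, y in zip(a, b))


row = identity_row(SPRA_SIGNAL, SPRB_SIGNAL)
matches = sum(1 for c in row if c != " ")
percent = 100.0 * matches / len(SPRA_SIGNAL)

print("SprA ", SPRA_SIGNAL)
print("     ", row)
print("SprB ", SPRB_SIGNAL)
print(f"{matches} of {len(SPRA_SIGNAL)} positions identical ({percent:.0f}%)")
```

The same function can be applied to the gapped rows of panels B and C, provided gap characters are excluded when counting matches.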

FIG.6B.

B

SprA  APEAESKATVSQLADASSAILAADVAGTAWYTEASTGKI
             A   QL  AS A L AD AGTAW
SprB  ETPRTFSAN--QLTAASDAVLGADIAGTAWNIDPQSKRL

SprA  VLTADSTVSKAELAKVSNALAGSKAK-LTVKRAEGKFTPL
      V T DSTVSKAE        AG  A  L   R  GKFT L
SprB  VVTVDSTVSKAEINQIKKS-AGANADALRIERTPGKFTKL

FIG.6C.

C

SprA  IAGGEAITTGGSRCSLGFNVSVNGVAHALTAGHCTNIS
      I GG AI     RCSLGFNV        LTAGHCT
SprB  ISGGDAIYSSTGRCSLGFNVRSGSTYYFLTAGHCTDGA

SprA  ASWSIGTRTGTSFPNNDYGIIRHSNPAAA-
      W GT G SFPNNDYGI R N
SprB  TTWWANSARTTVLGTTSGSSFPNNDYGIVRYTNTTIPK

SprA  DGRVYLYNGSYQDITTAGNAFVGQAVQRSGSTTGLRSG
      DG V    G  QDIT A NA VG AV R GSTTG  SG
SprB  DGTV----GG-QDITSAANATVGMAVTRRGSTTGTHSG

SprA  SVTGLNATVNYGSSGIVYGMIQTNVCAEPGDSGGSLFA
      SVT LNATVNYG    VYGMI TNVCAEPGDSGG L
SprB  SVTALNATVNYGGGDVVYGMIRTNVCAEPGDSGGPLYS

SprA  GSTALGLTSGGSGNCRTGGTTFYQPVTEALSAYGATVL
      G  A GLTSGGSGNC  GGTTF QPVTEALSAYG  V
SprB  GTRAIGLTSGGSGNCSSGGTTFFQPVTEALSAYGVSVY